Abstract. The well-known regularity lemma of E. Szemerédi for graphs (i.e. 2-uniform hypergraphs)
claims that for any graph there exists a vertex partition with the property of quasi-randomness.
We give a simple construction of such a partition. It is done just by taking a constant-bounded
number of random vertex samplings only one time (thus, iteration-free). Since it is independent of the
definition of quasi-randomness, it can be generalized very naturally to hypergraph regularization.
This expository note treats only the graph case of the paper [5] on hypergraphs, but it may help the
reader to access [5].

1. Introduction



The well-known regularity lemma of Szemerédi [12] (also called the uniformity lemma) was discovered
in the course of obtaining the so-called Szemerédi's theorem on arithmetic progressions, an
affirmative answer to a conjecture of Erdős and Turán. This graph-theoretic lemma is known to have
many applications across mathematics and theoretical computer science.

The regularity lemma claims that for any ordinary graph (i.e. any 2-uniform hypergraph) there
exists a vertex partition with the property of quasi-randomness. The purpose of this note is to give a
simple construction of such a partition. It has several advantages over previously known methods. It
is the case $k = 2$ (i.e. the case of 2-uniform hypergraphs) of [5], which deals with general $k$.
Although reading this expository note is not necessary in order to read [5], skimming it may help the
reader understand the main idea of [5].

Remark that our construction had not been known even for the simplest case k = 2 before [5]. 
Although the idea of partitioning the vertex-set randomly has been previously known ([3', fl]), such a 
construction was always done with a constant number of random sample vertices, and thus needed an
iteration. The key difference is that in our construction the number of random samplings is
constant-bounded but is itself chosen randomly. The proof can be naturally deduced once the claim is given.

Recall how the standard proof by Szemerédi constructs the desired vertex partition with
quasi-randomness. Roughly speaking, the partition is constructed by iterated applications of the
dichotomy between energy-increment and structure. That is, initially take an arbitrary vertex partition
(with a constant number of vertex sets). It can be shown that

(1) this partition satisfies the required quasi-random property or that

(2) there must exist another vertex partition finer than this partition such that
(2-1) the number of vertex sets increases but is still bounded by a constant and further that
(2-2) a value called 'energy' (or 'index') of the finer partition is significantly larger than the
'energy' of the coarser partition.

One replaces the coarser partition by the finer one and repeats this process. Since the energy is
at most one by its definition, the process must stop after at most a constant number of steps. (Note,
however, that the exact step at which it stops depends on the structure of the given graph.) The vertex
partition output at the final stage satisfies (1) and thus is the desired partition.
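
To make the 'energy' of step (2-2) concrete, here is a minimal Python sketch. The text above does not fix a formula, so the sketch assumes one standard choice, the mean square density $\sum_{A,B} \frac{|A||B|}{n^2} d(A,B)^2$ taken over ordered pairs of cells; the function names `density` and `energy` and the dictionary-of-neighbours input are hypothetical.

```python
def density(adj, A, B):
    """Fraction of ordered pairs (a, b) in A x B that are edges."""
    if not A or not B:
        return 0.0
    hits = sum(1 for a in A for b in B if b in adj[a])
    return hits / (len(A) * len(B))


def energy(adj, partition):
    """Mean square density of a vertex partition (an assumed 'index')."""
    n = sum(len(cell) for cell in partition)
    total = 0.0
    for A in partition:
        for B in partition:
            weight = len(A) * len(B) / (n * n)  # the weights sum to 1
            total += weight * density(adj, A, B) ** 2
    return total


if __name__ == "__main__":
    # Complete bipartite graph K_{2,2} on the parts {0, 1} and {2, 3}.
    adj = {0: {2, 3}, 1: {2, 3}, 2: {0, 1}, 3: {0, 1}}
    print(energy(adj, [[0, 1, 2, 3]]), energy(adj, [[0, 1], [2, 3]]))
```

Because the weights sum to one and every density lies in $[0, 1]$, this quantity never exceeds one, which is the boundedness used in the stopping argument. On the toy input the energy rises from 0.25 for the trivial partition to 0.5 for the bipartition.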

On the other hand, our construction goes as follows.
(0') Take a large constant $\tilde n$ which depends on $\varepsilon$ (the parameter measuring how
quasi-random the partition should be) but is independent of (the number of vertices of) the given graph.
Further take an integer sequence $0 = m_0 \le m_1 \le \cdots \le m_{\tilde n - 1}$ of length $\tilde n$,
also independent of the given graph.

(1') Choose an integer $0 \le n < \tilde n$ uniformly at random and further choose $m_n$ vertices
uniformly at random from the given graph.

(2') Label each vertex of the given graph by the adjacency between the vertex and the randomly
chosen vertices.

The resulting partition consists of a constant number of vertex sets (namely at most
$2^{m_n} \le 2^{m_{\tilde n - 1}}$) and is, with high probability, the desired partition.
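
Steps (0')-(2') can be sketched in a few lines of Python. This is an illustration under stated assumptions only: the graph is given as a dictionary `adj` from each vertex to its set of neighbours, the list `sizes` plays the role of the sequence $m_0 \le \cdots \le m_{\tilde n - 1}$, vertices are sampled with replacement, and the names `one_shot_partition`, `adj`, `sizes` and `rng` are hypothetical. The sketch does not choose the sequence the way the proof requires.

```python
import random
from collections import defaultdict


def one_shot_partition(adj, sizes, rng=None):
    """Partition the vertices of a graph by adjacency to random samples.

    adj   : dict mapping each vertex to the set of its neighbours
    sizes : the sequence m_0 <= m_1 <= ... (an assumed input)
    """
    rng = rng or random.Random()
    vertices = sorted(adj)
    # Step (1'): pick the index n uniformly, then m_n random vertices.
    n = rng.randrange(len(sizes))
    samples = [rng.choice(vertices) for _ in range(sizes[n])]
    # Step (2'): label each vertex by its adjacency pattern to the samples.
    cells = defaultdict(list)
    for v in vertices:
        label = tuple(u in adj[v] for u in samples)
        cells[label].append(v)
    return samples, list(cells.values())


if __name__ == "__main__":
    # A 6-cycle as a toy input.
    cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
    samples, cells = one_shot_partition(cycle, [0, 1, 2], random.Random(0))
    print(len(samples), cells)
```

The number of cells is at most $2^{m_n}$ because a label is a 0/1 vector of length $m_n$; no loop over refinement steps appears anywhere, which is the iteration-free feature described above.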

Previous constructions, including the usual one by Szemerédi, consist of iterated procedures, while
our construction consists of only one procedure. Furthermore, ours is independent of the definition of
quasi-randomness, while previous constructions depend on it. Several definitions of quasi-randomness
are known. For ordinary graphs, all of them are known to be equivalent. However, it
has been noted that, unlike the situation for graphs, there are several ways one might define regularity
for hypergraphs (Tao-Vu [15, p. 455], Rödl-Skokan [10, p. 1]). Because our construction is independent
of the definition of quasi-randomness, it can be naturally generalized from ordinary graphs to
hypergraphs. For the extension to hypergraphs, see [5].

The purpose of this note is to present a new construction of the vertex partition (for the case of
ordinary graphs) and to show that it indeed satisfies quasi-randomness. For ordinary graphs there are
several definitions of quasi-randomness, all of which are known to be equivalent ([2]). As our
definition of quasi-randomness, we choose the one based on the number of induced subgraphs. We explain
in this paragraph why we use this definition, although the choice is not essential when considering
only ordinary graphs (since the definitions are equivalent). The usual regularity
lemma first defines quasi-randomness by a condition on the number of edges between two sets of
vertices. Second, Szemerédi proved the existence of a vertex partition with this quasi-randomness.
Third and finally, quasi-randomness in the sense of counting induced subgraphs (i.e. our definition)
can be derived from his quasi-randomness (on edges between two subsets). The third step is called the
counting lemma and is easy to show, so only the second step is the core of the matter. All of the
three hypergraph-theoretic proofs of Szemeredi's arithmetic-progression theorem by Rodl et al. [lOl |9] , 
Gowers |4] and Tao [T^ can be considered as generalizations of the above three steps. But unlike the 
case of graphs, the third step (counting lemma) was hard to show for hypergraphs. All of the three 
proofs are different partly because all of them employed different definitions of quasi-randomness, on 
which their regularizations depend. On the other hand, we will not follow the above three steps. We 
will define a probabilistic construction for partitioning the vertices, which will be proven to satisfy 
the condition of our quasi-randomness on counting induced subgraphs. This strategy can be very 
naturally generalized from graphs to hypergraphs in [5] . I believe that this framework of hypergraph 
regularity lemma is convenient for a wide range of applications on hypergraphs. In fact, applications 
of our method are seen in [71 [5] . 

One of the new major technical ingredients in our proof comes from the use of 'linearity of expec- 
tation.' All of the previous proofs use the dichotomy (or energy-increment) explicitly. (See ^ §6], 
[14, §1].) Namely, when proving the existence of a vertex partition, they define an 'energy' (or index)
as the maximum (or supremum) of some (energy) function. (For example, see [13, eq. (8)].) It
corresponds to (18) in this paper. They consider the maximum value of this energy over all subdivisions
in each step. If the energy significantly increases under some subdivision, they take the worst
subdivision as the base partition of the next step. They then repeat this process. Since the energy is
bounded, this operation must stop at some step, at which point there is no very bad subdivision, and
thus most cells must be quasi-random (dichotomy).

On the other hand, we (implicitly) take an average subdivision instead of the worst one. The 
definition of our regularization determines the probability space of partitions (subdivisions). We also 
randomly decide on the number of vertex samples to choose. 

With these ideas, we can hide the troublesome dichotomy iterations inside linear equations of
expectations (27).

We have two reasons why we will deal with multi-colored graphs instead of ordinary graphs, even 
though almost all previous researchers dealt with the usual graphs. First, our proof of the regularity 
lemma will be natural. Second, we can naturally combine subgraph (black&invisible) and induced- 
subgraph (black&white) problems when we apply our result, while the two have usually been discussed 
separately. 

2. Statement of the Theorem 

In this paper, $\mathbb{P}$ and $\mathbb{E}$ will denote probability and expectation, respectively. We
denote conditional probability and expectation by $\mathbb{P}[\cdots \mid \cdots]$ and
$\mathbb{E}[\cdots \mid \cdots]$.

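Setup 2.1. Throughout this paper, we fix a positive integer $r$ and an 'index' set $\mathfrak{r}$ with
$|\mathfrak{r}| = r$. Also we fix a probability space $(\Omega_i, \mathcal{B}_i, \mathbb{P})$ for each
$i \in \mathfrak{r}$. We assume that $\Omega_i$ is finite and that $\mathcal{B}_i = 2^{\Omega_i}$
(for the sake of simplicity). Write $\Omega := (\Omega_i)_{i\in\mathfrak{r}}$. □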

In order to avoid measure-theoretic jargon such as measurability or Fubini's theorem, for the
benefit of readers who are interested only in applications to discrete mathematics, we assume
$\Omega_i$ to be a (non-empty) finite set. However, our arguments should extend to a general
probability space. For applications, $\Omega_i$ will usually contain a huge number of vertices. We will
not use this assumption logically in our proof, but it is important in our theorems that some
parameters and functions depend on $r$ but are independent of any $\Omega_i$.

In what follows, we will try to embed a small $r$-partite graph into another large $r$-partite graph,
where the $r$ vertex sets of the large graph will always be $(\Omega_i)_{i\in\mathfrak{r}}$. The large
graph and its vertices and edges will be denoted by bold fonts (e.g. $\mathbf{G}, \mathbf{v},
\mathbf{v}', \mathbf{e}, \cdots$).

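For an integer $a$, we write $[a] := \{1, 2, \cdots, a\}$, and
$\binom{\mathfrak{r}}{[a]} := \bigcup_{i\in[a]} \binom{\mathfrak{r}}{i} = \bigcup_{i\in[a]} \{I \subset \mathfrak{r} \mid |I| = i\}$.
Thus $\binom{\mathfrak{r}}{[2]} = \binom{\mathfrak{r}}{1} \cup \binom{\mathfrak{r}}{2}$. When $r$ disjoint
sets $X_i$, $i \in \mathfrak{r}$, with indices from $\mathfrak{r}$ are called vertex sets, we
write $X_J := \{Y \subset \bigcup_{j\in J} X_j \mid |Y \cap X_j| = 1, \forall j \in J\}$ whenever
$J \subset \mathfrak{r}$. Thus $|X_J| = \prod_{j\in J} |X_j|$. That is,
for $J = \{1, 2\}$, $|X_J| = |X_1||X_2|$.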
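
As a small illustration of the notation $X_J$, the following Python sketch enumerates the sets that pick exactly one vertex from each $X_j$, $j \in J$. The helper name `edges_of_index` and the dictionary representation of the vertex sets are assumptions made for this example only.

```python
from itertools import product


def edges_of_index(vertex_sets, J):
    """Return X_J as a list of frozensets, one vertex from each X_j, j in J."""
    factors = [vertex_sets[j] for j in sorted(J)]
    return [frozenset(choice) for choice in product(*factors)]


if __name__ == "__main__":
    X = {1: ["a1", "a2"], 2: ["b1", "b2", "b3"], 3: ["c1"]}
    print(len(edges_of_index(X, {1, 2})))
```

Since the sets $X_j$ are disjoint, distinct choices give distinct sets, so the list has $\prod_{j\in J} |X_j|$ elements.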

Definition 2.1. [Colored graphs] Suppose Setup 2.1. Given $b_1$ and $b_2$, a $(b_1, b_2)$-colored
($r$-partite) graph $H$ is a triple
$((X_i)_{i\in\mathfrak{r}}, (C_I)_{I\in\binom{\mathfrak{r}}{[2]}}, (\gamma_I)_{I\in\binom{\mathfrak{r}}{[2]}})$ where:

(1) each $X_i$ is a set called a 'vertex set,'

(2) $C_I$ is a set with at most $b_{|I|}$ elements, and

(3) $\gamma_I$ is a map from $X_I$ to $C_I$.

We write $V(H) := \bigcup_{i\in\mathfrak{r}} X_i$ and $C_I(H) := C_I$ for each $I$. Each element of
$V(H)$ is called a vertex. Each element $e \in V_I(H) := X_I$, $I \in \binom{\mathfrak{r}}{[2]}$, is
called an (index-$I$) edge. Thus, when $|I| = 1$, an index-$I$ edge is just a vertex of $H$. Each member
of $C_I(H)$ is a (face-)color (of index $I$). Write $H(e) := \gamma_I(e)$ for each $I$. (So we will not
need the notation $\gamma$ after this definition.)

When $I = \{i, j\} \in \binom{\mathfrak{r}}{2}$ (i.e. $i \ne j$) and $e = \{v_i, v_j\} \in V_I(H)$, we
define the frame-color and total-color of $e$ by $H(\partial e) := (H(v_i), H(v_j))$ and by
$H(\langle e\rangle) = H\langle e\rangle := (H(e); H(v_i), H(v_j))$. For a vertex $v_i \in X_i$ (which is
also an index-$i$ edge), we define the total-color of $v$ by
$H(\langle v\rangle) = H\langle v\rangle := H(v)$. The frame-color of a vertex is always the empty
tuple $()$. Write $\mathrm{TC}_I(H) := \{H\langle e\rangle \mid e \in X_I = V_I(H)\}$,
$\mathrm{TC}_s(H) := \bigcup_{I\in\binom{\mathfrak{r}}{s}} \mathrm{TC}_I(H)$, and
$\mathrm{TC}(H) := \mathrm{TC}_1(H) \cup \mathrm{TC}_2(H)$, where TC means total-color. □

As usual, we will call a $(b_1, b_2)$-colored graph just a colored graph or a graph when we do not
need to mention the values $b_1, b_2$.

Definition 2.2. [Complexes] A (simplicial-)complex is a (colored $r$-partite) graph such that:

(1) for each $I \in \binom{\mathfrak{r}}{[2]}$ there exists at most one index-$I$ color called
'invisible', and

(2) if (the color of) an edge $e$ is invisible then for any edge $e^* \supset e$, its color must also
be invisible.

A color is visible if and only if it is not invisible. We simply say that an edge is visible/invisible
when its color is so.

For a graph $\mathbf{G}$ on $\Omega$, let $\mathcal{S}_{h,\mathbf{G}}$ be the set of complexes $S$ such that:

(1) each of the $r$ vertex sets of the $r$-partite graph $S$ contains exactly $h$ vertices, and

(2) for each $I \in \binom{\mathfrak{r}}{[2]}$ there is an injection from the index-$I$ visible colors
of $S$ to the index-$I$ colors of $\mathbf{G}$.
(When the injection maps a visible color $c$ of $S$ to a color $c'$ of $\mathbf{G}$, we simply write
$c = c'$ without presenting the injection explicitly.) For $S \in \mathcal{S}_{h,\mathbf{G}}$, we denote
by $\mathbb{V}_I(S)$ the set of index-$I$ visible edges.
Write $\mathbb{V}_i(S) := \bigcup_{I\in\binom{\mathfrak{r}}{i}} \mathbb{V}_I(S)$ and
$\mathbb{V}(S) := \mathbb{V}_1(S) \cup \mathbb{V}_2(S)$. Clearly we have

$|\mathbb{V}_1(S)| \le rh$ and $|\mathbb{V}_2(S)| \le \binom{r}{2} h^2$. (1)

□

Definition 2.3. [Partitionwise maps] A partitionwise map
$\varphi : \bigcup_{i\in\mathfrak{r}} W_i \to \bigcup_{i\in\mathfrak{r}} \Omega_i$ is a map from $r$
disjoint vertex sets $W_i$, $i \in \mathfrak{r}$, with $|W_i| < \infty$, to the $r$ vertex sets
(probability spaces) $\Omega_i$, $i \in \mathfrak{r}$, such that each $w \in W_i$ is mapped into
$\Omega_i$. That is, any vertex is mapped to a vertex with the same index. We denote by
$\Phi((W_i)_{i\in\mathfrak{r}})$ or $\Phi(\bigcup_{i\in\mathfrak{r}} W_i)$ the set of partitionwise maps
from $(W_i)_i$. When $W_i = \{(i, 1), \cdots, (i, h)\}$ or when the $W_i$ are obvious and $|W_i| = h$, we
denote it by $\Phi(h)$. A partitionwise map is random if and only if each $w \in W_i$ is independently
mapped to a vertex in the probability space $\Omega_i$. □

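Definition 2.4. [Regularization] Let $m \ge 0$. Let $\mathbf{G}$ be a graph on $\Omega$ and let
$\varphi \in \Phi(m)$. The regularization of $\mathbf{G}$ by $\varphi$ is the graph
$\mathbf{G}/\varphi$ on $\Omega$ obtained from $\mathbf{G}$ by redefining the color of each
vertex $\mathbf{v} \in \Omega_i$, $i \in \mathfrak{r}$, by the $(1 + (r-1)m)$-dimensional vector

$(\mathbf{G}/\varphi)(\mathbf{v}) := \bigl(\mathbf{G}(\{\mathbf{v}, \mathbf{u}\}) \bigm| \mathbf{u} = \mathbf{v} \text{ or } \mathbf{u} \in \Omega_j,\ j \in \mathfrak{r} \setminus \{i\}, \text{ is in the range of } \varphi\bigr).$
□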



Roughly speaking, the color of a vertex $\mathbf{v}$ in $\mathbf{G}/\varphi$ records the color-patterns
of the size-2 edges connecting $\mathbf{v}$ to the randomly sampled vertices, together with the original
color $\mathbf{G}(\mathbf{v})$. (Here a size-2 edge means an edge which is not a single vertex but a
pair of vertices.)

Note that edges of size 2 (i.e. not vertices) do not get recolored in this process. Only vertices
change their colors, just as in the usual regularity lemma.
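
The recoloring of Definition 2.4 can be sketched as follows. This is a minimal illustration, assuming the uniform measure on each part and a dictionary-based representation; the names `regularize`, `parts`, `vcolor` and `ecolor` are hypothetical and not taken from the paper.

```python
import random


def regularize(parts, vcolor, ecolor, m, rng=None):
    """Recolour vertices as in Definition 2.4 (a sketch under assumptions).

    parts  : dict index -> list of vertices (the sets Omega_i)
    vcolor : dict vertex -> colour
    ecolor : dict frozenset({u, v}) -> colour, for u, v in different parts
    m      : number of random vertices drawn in each part
    """
    rng = rng or random.Random()
    # A random partitionwise map: m independent draws in every part
    # (uniform measure assumed here).
    phi = {i: [rng.choice(vs) for _ in range(m)] for i, vs in parts.items()}
    new_vcolor = {}
    for i, vs in parts.items():
        for v in vs:
            pattern = [vcolor[v]]  # the original colour G(v)
            for j in sorted(parts):
                if j != i:
                    pattern.extend(ecolor[frozenset((v, u))] for u in phi[j])
            new_vcolor[v] = tuple(pattern)
    # Size-2 edges keep their colours; only vertex colours change.
    return phi, new_vcolor
```

Each new vertex color is a tuple of length $1 + (r-1)m$: one entry for the old color and $m$ entries for each of the other $r - 1$ parts.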

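Definition 2.5. [Regularity] Let $\mathbf{G}$ be a graph on $\Omega$. For
$c = (c_J)_{J\subset I} \in \mathrm{TC}_I(\mathbf{G})$, $I \in \binom{\mathfrak{r}}{[2]}$, define
relative density by the conditional probability

$d_{\mathbf{G}}(c) := \mathbb{P}_{\mathbf{e}\in\Omega_I}\bigl[\mathbf{G}(\mathbf{e}) = c_I \bigm| \mathbf{G}(\partial\mathbf{e}) = (c_J)_{J\subsetneq I}\bigr].$ (2)

When $|I| = 1$, in the above $\mathbf{e}$ is a vertex and the conditional part is considered to always
hold. (Thus for $I = \{j\}$ and for an index-$\{j\}$ color $c_j \in \mathrm{TC}_1(\mathbf{G})$, we have
$d_{\mathbf{G}}(c) = \mathbb{P}_{\mathbf{v}\in\Omega_j}[\mathbf{G}(\mathbf{v}) = c_j]$, i.e. the
proportion of the vertices in $\Omega_j$ that have color $c_j$. For $I = \{i, j\}$ and for
$c = (c_{ij}; c_i, c_j)$, we have

$d_{\mathbf{G}}(c) = \mathbb{P}_{\mathbf{v}_i\in\Omega_i, \mathbf{v}_j\in\Omega_j}\bigl[\mathbf{G}(\mathbf{v}_i\mathbf{v}_j) = c_{ij} \bigm| \mathbf{G}(\mathbf{v}_i) = c_i \text{ and } \mathbf{G}(\mathbf{v}_j) = c_j\bigr].$ )

For a positive integer $h$ and $\varepsilon > 0$, we say that $\mathbf{G}$ is $(\varepsilon, h)$-regular
if and only if there exists a function $\delta : \mathrm{TC}_2(\mathbf{G}) \to [0, \infty)$ such that

(i) $\mathbb{P}_{\phi\in\Phi(h)}\bigl[\mathbf{G}(\phi(e)) = S(e), \forall e \in \mathbb{V}(S)\bigr] = \prod_{e\in\mathbb{V}_1(S)} d_{\mathbf{G}}(S\langle e\rangle) \prod_{e\in\mathbb{V}_2(S)} \bigl(d_{\mathbf{G}}(S\langle e\rangle) \pm \delta(S\langle e\rangle)\bigr), \quad \forall S \in \mathcal{S}_{h,\mathbf{G}},$ (3)

(ii) $\mathbb{E}_{\mathbf{e}\in\Omega_I}\bigl[\delta(\mathbf{G}\langle\mathbf{e}\rangle)\bigr] \le \varepsilon / |C_I(\mathbf{G})|, \quad \forall I \in \binom{\mathfrak{r}}{2},$ (4)

where $a \pm b$ denotes a suitable number $c$ satisfying $\max\{0, a - b\} \le c \le \min\{1, a + b\}$.

Denote by $\mathrm{reg}_h(\mathbf{G})$ the minimum value of $\varepsilon$ such that $\mathbf{G}$ is
$(\varepsilon, h)$-regular. □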
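
For an index pair $\{i, j\}$, the relative density of (2) is a plain conditional frequency. The sketch below computes it exactly with rational arithmetic, assuming the uniform measure on each part; the function name `relative_density` and the dictionary inputs are hypothetical.

```python
from fractions import Fraction


def relative_density(part_i, part_j, vcolor, ecolor, c_ij, c_i, c_j):
    """d_G(c) for c = (c_ij; c_i, c_j) under the uniform measure (assumed)."""
    # Pairs whose frame-colour equals (c_i, c_j): the conditioning event.
    pairs = [(u, v) for u in part_i for v in part_j
             if vcolor[u] == c_i and vcolor[v] == c_j]
    if not pairs:
        return None  # the conditioning event has probability zero
    hits = sum(1 for u, v in pairs if ecolor[frozenset((u, v))] == c_ij)
    return Fraction(hits, len(pairs))
```

Returning `None` when no pair has the requested frame-color is a choice made for this sketch; the definition itself is silent on that degenerate case.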



Remark. Roughly speaking, (i) measures how far from random the graph $\mathbf{G}$ is with respect to
containing the expected number of copies of the (colored) subgraphs $S \in \mathcal{S}_{h,\mathbf{G}}$.
The smaller $\delta$ is, the closer $\mathbf{G}$ is to being random. When $\delta = 0$, $\mathbf{G}$
behaves exactly like a random graph. On the other hand, if we take $\delta = 1$ then (i) is
automatically satisfied. Condition (ii) places an upper bound on
the size of $\delta$. Our proof will yield the main theorem even if we replace the right-hand side of (ii) by
57(|C/(G)|) for any fixed functions gj > 0, for example, gi{x) = 

Our main theorem is as follows. 

Theorem 2.2 (Main). For any $r \ge 2$, $h$, $b = (b_1, b_2)$, and $\varepsilon > 0$, there exist an
(increasing) function $m : \mathbb{N} \to \mathbb{N}$ and an integer $\tilde n$ satisfying the following:
If $\mathbf{G}$ is a $b$-colored ($r$-partite) graph on $\Omega$ then

$\mathbb{E}_n \mathbb{E}_\varphi[\mathrm{reg}_h(\mathbf{G}/\varphi)] \le \varepsilon$

where $n$ is chosen randomly in $[0, \tilde n - 1]$ and where $\varphi \in \Phi(m(n))$ is random.

Note that $m$ and $\tilde n$ depend only on $r$, $h$, $b$ and $\varepsilon$ and are independent of
everything else (including $\Omega$). Since $m$ is increasing, we put $\tilde m := m(\tilde n) > m(n)$ and get:

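Corollary 2.3 (Regularity Lemma). For any $r \ge 2$, $h$, $b = (b_1, b_2)$, and $\varepsilon > 0$, there
exists an integer $\tilde m$ such that if $\mathbf{G}$ is a $b$-colored ($r$-partite) graph on $\Omega$
then for some integer $m \le \tilde m$, we have

$\mathbb{E}_{\varphi\in\Phi(m)}[\mathrm{reg}_h(\mathbf{G}/\varphi)] \le \varepsilon.$ (5)

In particular, when (5) holds, if we pick a map $\varphi \in \Phi(m)$ randomly then with probability at
least $1 - \sqrt{\varepsilon}$ we have $\mathrm{reg}_h(\mathbf{G}/\varphi) \le \sqrt{\varepsilon}$, and
thus $\mathbf{G}/\varphi$ is $(\sqrt{\varepsilon}, h)$-regular.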
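
The final claim of the corollary is an instance of Markov's inequality. Writing $X := \mathrm{reg}_h(\mathbf{G}/\varphi) \ge 0$ (the symbol $X$ is introduced here only for this remark) and assuming the bound (5) on its expectation, the step can be spelled out as:

```latex
\[
  \mathbb{P}_{\varphi\in\Phi(m)}\bigl[X > \sqrt{\varepsilon}\,\bigr]
  \;\le\; \frac{\mathbb{E}_{\varphi\in\Phi(m)}[X]}{\sqrt{\varepsilon}}
  \;\le\; \frac{\varepsilon}{\sqrt{\varepsilon}}
  \;=\; \sqrt{\varepsilon}.
\]
```

Hence $X \le \sqrt{\varepsilon}$ holds with probability at least $1 - \sqrt{\varepsilon}$.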

It is important that the above integer $m$ is bounded by a constant $\tilde m$ independent of
$\mathbf{G}$, while the exact value of $m$ itself depends on $\mathbf{G}$. Note that, in (canonical)
property testing, the exact value of $m$ is also independent of $\mathbf{G}$. This is a new critical
idea which has not appeared previously, although some had felt that property testing and graph
regularization seem to be closely related (e.g. [1]).

Of course, we can rewrite the above results for non-partite graphs. 

3. Proof of the Main Theorem 

Before we proceed with the proof of the Main Theorem, we will need to establish two lemmas. 
We admit that they may appear a bit technical and unmotivated at this point, but their use will be 
clearer once we see how they are used in the main proof. 
3.1. Two lemmas and their proofs.

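Definition 3.1. [Notation for the lemmas] Let $\mathbf{G}$ be an ($r$-partite) graph on $\Omega$. For
two edges $\mathbf{e}, \mathbf{e}' \in \Omega_I$, we abbreviate
$\mathbf{G}(\mathbf{e}) = \mathbf{G}(\mathbf{e}')$ and
$\mathbf{G}(\partial\mathbf{e}) = \mathbf{G}(\partial\mathbf{e}')$ by
$\mathbf{e} \overset{\mathbf{G}}{\approx} \mathbf{e}'$ and
$\mathbf{e} \overset{\partial\mathbf{G}}{\approx} \mathbf{e}'$, respectively.
An $h$-error function of $\mathbf{G}$ is a function
$\delta : \bigcup_{I\in\binom{\mathfrak{r}}{2}} \mathrm{TC}_I(\mathbf{G}) \to [0, \infty)$ satisfying (3)
for all $S \in \mathcal{S}_{h,\mathbf{G}}$.
Denote by $[\![\cdots]\!]$ the Iverson bracket, i.e., it equals 1 if the statement in the bracket
holds, and 0 otherwise. □
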



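Lemma 3.1 (Correlation bounds counting error). For any graph $\mathbf{G}$ on $\Omega$ and for
$S \in \mathcal{S}_{h,\mathbf{G}}$, we have

$\Bigl| \mathbb{P}_{\phi\in\Phi(h)}\bigl[\mathbf{G}(\phi(e)) = S(e), \forall e \in \mathbb{V}_2(S) \bigm| \mathbf{G}(\phi(v)) = S(v), \forall v \in \mathbb{V}_1(S)\bigr] - \prod_{e\in\mathbb{V}_2(S)} d_{\mathbf{G}}(S\langle e\rangle) \Bigr|$

$\le |\mathbb{V}_2(S)| \max_{\emptyset \ne D \subset \mathbb{V}_2(S)} \Bigl| \mathbb{E}_{\phi\in\Phi(h)}\Bigl[ \prod_{e\in D} \bigl([\![\mathbf{G}(\phi(e)) = S(e)]\!] - d_{\mathbf{G}}(S\langle e\rangle)\bigr) \Bigm| \mathbf{G}(\phi(v)) = S(v), \forall v \in \mathbb{V}_1(S) \Bigr] \Bigr|.$
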



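Proof: [Tool: Nothing] We prove this by induction on $|\mathbb{V}_2(S)|$. If $|\mathbb{V}_2(S)| \le 1$
then the statement is trivial, since in this case the expression on the left-hand side of the
inequality is 0. So let us assume that $|\mathbb{V}_2(S)| \ge 2$ and that the result holds for all
smaller values of $|\mathbb{V}_2(S)|$. Let $d_e := d_{\mathbf{G}}(S\langle e\rangle)$, and let $\eta$ be
the maximum appearing on the desired right-hand side. Then, taking $D := \mathbb{V}_2(S)$, we have

$[-\eta, \eta] \ni \mathbb{E}_{\phi\in\Phi(h)=\Phi(V(S))}\Bigl[ \prod_{e\in\mathbb{V}_2(S)} \bigl([\![\mathbf{G}(\phi(e)) = S(e)]\!] - d_{\mathbf{G}}(S\langle e\rangle)\bigr) \Bigm| \mathbf{G}(\phi(v)) = S(v), \forall v \in \mathbb{V}_1(S) \Bigr]$

$= \mathbb{E}_{\phi\in\Phi(h)}\Bigl[ \prod_{e\in\mathbb{V}_2(S)} [\![\mathbf{G}(\phi(e)) = S(e)]\!] \Bigm| \mathbf{G}(\phi(v)) = S(v), \forall v \in \mathbb{V}_1(S) \Bigr]$

$\quad + \sum_{\emptyset \ne D \subset \mathbb{V}_2(S)} \Bigl( \prod_{e\in D} (-d_e) \Bigr) \mathbb{E}_{\phi\in\Phi(h)}\Bigl[ \prod_{e\in\mathbb{V}_2(S)\setminus D} [\![\mathbf{G}(\phi(e)) = S(e)]\!] \Bigm| \mathbf{G}(\phi(v)) = S(v), \forall v \in \mathbb{V}_1(S) \Bigr]$

by expanding the product and using the linearity of expectation and the definition of $d_e$. Now we
focus on the second term above. Since the value of $[\![\mathbf{G}(\phi(e)) = S(e)]\!]$ is 0 or 1, we
can replace $\mathbb{E}$ by $\mathbb{P}$ and consequently apply the induction hypothesis (since $D$ is
nonempty). Consider the complex $S^-$ with $\mathbb{V}_2(S^-) = \mathbb{V}_2(S) \setminus D$ obtained by
making the edges in $D$ of $S$ invisible.

Using the inductive hypothesis for the complex $S^-$ in place of $S$, we rewrite the second term and
obtain

$\mathbb{E}_{\phi\in\Phi(h)}\Bigl[ \prod_{e\in\mathbb{V}_2(S)} [\![\mathbf{G}(\phi(e)) = S(e)]\!] \Bigm| \mathbf{G}(\phi(v)) = S(v), \forall v \in \mathbb{V}_1(S) \Bigr]$

$\overset{\text{I.H.}}{=} \pm\eta - \sum_{\emptyset \ne D \subset \mathbb{V}_2(S)} \Bigl( \prod_{e\in D} (-d_e) \Bigr) \Bigl( \prod_{e\in\mathbb{V}_2(S)\setminus D} d_e \pm |\mathbb{V}_2(S^-)|\eta \Bigr)$

$= \pm\eta - \Bigl( \prod_{e\in\mathbb{V}_2(S)} d_e \pm |\mathbb{V}_2(S^-)|\eta \Bigr) \Bigl( \sum_{\emptyset \ne D \subset \mathbb{V}_2(S)} \prod_{e\in D} (-1) \Bigr) \quad (\because |d_e| \le 1)$

$= \pm\eta - \Bigl( \prod_{e\in\mathbb{V}_2(S)} d_e \pm (|\mathbb{V}_2(S)| - 1)\eta \Bigr) \bigl( (1 - 1)^{|\mathbb{V}_2(S)|} - 1 \bigr) \quad (\because |\mathbb{V}_2(S)| > |\mathbb{V}_2(S^-)|)$

$= \prod_{e\in\mathbb{V}_2(S)} d_e \pm |\mathbb{V}_2(S)|\eta.$

□
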



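We will use the following form of the Cauchy-Schwarz inequality.

Fact 3.2 (Cauchy-Schwarz inequality). For a random variable $X$ on a probability space $\Omega$, if an
equivalence relation $\approx$ on $\Omega$ is a refinement of another equivalence relation $\sim$ on
$\Omega$ then

$\mathbb{E}_{\omega_0\in\Omega}\bigl( \mathbb{E}_{\omega\in\Omega}[X(\omega) \mid \omega \approx \omega_0] \bigr)^2 \ge \mathbb{E}_{\omega_0\in\Omega}\bigl( \mathbb{E}_{\omega\in\Omega}[X(\omega) \mid \omega \sim \omega_0] \bigr)^2.$ (6)

Proof: By the Cauchy-Schwarz inequality (i.e. $\mathbb{E}[X^2]\mathbb{E}[Y^2] \ge (\mathbb{E}[XY])^2$), we have

$\mathbb{E}_{\omega_0}\bigl( \mathbb{E}_{\omega}[X(\omega) \mid \omega \approx \omega_0] \bigr)^2 = \mathbb{E}_{\omega_0} \mathbb{E}_{\omega'}\Bigl[ \bigl( \mathbb{E}_{\omega}[X(\omega) \mid \omega \approx \omega'] \bigr)^2 \Bigm| \omega' \sim \omega_0 \Bigr]$

$\overset{\text{CS}}{\ge} \mathbb{E}_{\omega_0}\Bigl( \mathbb{E}_{\omega'}\bigl[ 1 \cdot \mathbb{E}_{\omega}[X(\omega) \mid \omega \approx \omega'] \bigm| \omega' \sim \omega_0 \bigr] \Bigr)^2 = \mathbb{E}_{\omega_0}\bigl( \mathbb{E}_{\omega}[X(\omega) \mid \omega \sim \omega_0] \bigr)^2.$

□

With this fact and Definition 3.1, we next tackle the following lemma.
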
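
Fact 3.2 is easy to check numerically on a finite space. The sketch below is only a sanity check, assuming the uniform measure on $\{0, \dots, n-1\}$ and equivalence relations encoded as lists of class labels; the function name `mean_sq_cond_exp` is hypothetical.

```python
import random


def mean_sq_cond_exp(values, labels):
    """E_{w0} (E[X | w ~ w0])^2 for the uniform measure on range(len(values)).

    labels[w] names the equivalence class of the point w.
    """
    groups = {}
    for x, lab in zip(values, labels):
        groups.setdefault(lab, []).append(x)
    n = len(values)
    # Each class contributes (class probability) * (class mean)^2.
    return sum(len(g) * (sum(g) / len(g)) ** 2 for g in groups.values()) / n


if __name__ == "__main__":
    rng = random.Random(0)
    values = [rng.uniform(-1, 1) for _ in range(12)]
    fine = [w // 2 for w in range(12)]    # classes of size 2
    coarse = [w // 4 for w in range(12)]  # each coarse class is a union of fine ones
    print(mean_sq_cond_exp(values, fine) >= mean_sq_cond_exp(values, coarse))
```

The finer relation always gives the larger (or equal) mean square, which is the monotonicity that the main proof later exploits when more random vertices are sampled.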



Lemma 3.3 (Mean square bounds correlation). Let h and m be positive integers and G an r -partite 
graph on the vertex set Q. Let S € Sh,G arid let Fe : C/(G) [—1,1] be a function for each I E (2) 
and for each e £ Yj{S). For any I £ (2) and G V/(5), we have 

2 



E, 



<t>e<S'(h) 



n ^^w^)) n iGiHv))^s{i 



< 



E^^enAi ^.enAFea (e) lG{de) = 5(9eo)l|e 




n dG(^(^^» n dG(5(^;» 

^tieVi(s) / \ \t)eVi(s),D^eo 

where (j),ip are random and where we abbreviate i^e(G(e)) by Fe{e). 

In particular, if we suppose < nueVi(S) v^eo '^d^i'^)) T^-S-; ™ large) then 

- \ 2 

eeV2(S) 



(7) 



G((/i(?;)) = 5(v)Vw e Vi(5) 



< 2E;p£$(m^)Ee*er2^ I 



dG/<p 

Beri, [i^eo (e) |e « e* 



|G(ae*) = 5(aeo)]. 



(8) 



Proof : [Tools: Cauchy-Schwarz, Fact [SJ] Fix la G (2) and eo G V/o(5). For e $(F(S') \ eo) 
and for eo G JI/q, we define the (extended) function (J)'-'^"'' G $(^(5*)) such that: 

(i) each w G eo is mapped to the corresponding v G eo with the index of v, and that, 

(ii) each v G 1^(5") \ eo is mapped to (f>{v). 

(That is, when we have a map from all but two vertices eg, we extend it by assigning two vertices 
Co to two vertices Bq. ) For an m-tuplc of maps ip = {^i)i£[m] with ipi G ^(V{S) \ eg), we define an 

equivalence relation ^ on O/^ by the condition that 

e^ e' if and only if ^f)(e) § ^(e) Ve G V(S') \ {eo}, Vi G [m]. (9) 

(Note that 1^(5*) \ eo is a vertex set while V(5) \ {eo} is an edge set. Since the right-hand side clearly 
holds for e with e n eo = 0, it is enough to check only the edges e G V(5) with |e n eo| = 1.) 
Let S'(i), • • • , S'(™) and e[,^', • • • , e[,™' be m copies of 5 and eo. For = i^^)^e[n^] with ip, G $(V^(5(*))\ 
e^'^), let if* G $(m/i) = <P{V{S'^^'>)0 ■ ■ ■ OV{S^"'^)) be an extended function of ^,'s, i.e., ip*{v) = ip^{v) 
forall?;GT/(5W)\ eg'' , i G [to] . Then it is not hard to see that 



ewe mrplies e ~ e . 



(10) 

dG/ip' G/i/3* 

(To see this, observe that if /o = {1, 2} and {vi, V2} ~ {v'l, v'2}, or equivalently Vj « ^' j{j ~ 

1,2), then {vj,Lp*{v)} § {w'j,Lp*{v)} for all v G T^(S'W) having no index j (i.e. v G V,'(5«),/ 7^ j). 

Since ip*{v) = ip,{v) if w ^ e^\ we see y^p^'''^" (e) § ^^^'''^''''^^^ (e) for aU e G ¥2(5*) \ {eo} with 
|e n eol = 1, implying ^ by dH).) 

Let F^Je) :== Fe„(e)|G(ae) = 5(aeo)l and let 

i^*(^):= n ^^W^)) n IG(<^(«)) = ^(«)1. 
Then, since |- • -p = |- • •], the left-hand side of (O becomes 



i^eo(^(eo)) n i^e (0(e)) • IG(5(0(eo))) = 5(9eo)F [] IG(0(t^)) = ^(t^) 

eGV2(S)\{eo} «6Vi (S):i;^eo 

2 

(by the definition of F*^ and F*) 



E0g$(/i 

(E^e*(/.) K Weo))^^*(<^)]) 
E, 



F;(eo)F*(0(-°)) 



eoer27o.0e*(y(S)\eo) 

(since the two expectations are taken over random choices of 2 + {rh — 2) = rh vertices in V{S)) 

2 



V={Vi)iel^]e{'S>iViS)\eo)y^^eoen,„ 



(eo)\ 



^V={Vi)iel^]e{'S'iViS)\e„))"-^eoeni„ 



E, 



(since for any random variable X and the equivalence classes Ci by ~, 
Ee„Ee[X(e)|e ^ Gq] - E^^^o[eo e Q]Ee[X(e)|e e a] = E,,[X{eo)] ) 



^V=(¥'i)ie[He(*(^(S')\eo))'"IEeoenj„ 



Eeen.„K(e)|e^eo]E,,[„,][^^*(^^))] 



(since F*{ip'i') = i^*((^r ) when e ^ Gq) 



< E^Eeo 



Eeej^.jF; (e) |e ^ eo] 



IEip=(l^;);Eeo 



(by Cauchy-Schwarz) 



< 



3enrJi^;o(e)|e w eo_ 



E;^^(^^)^Eeoer2j^j 



E,;,,.,[„,][F*(^^V*(^}-0] 



So, the first term in this last line is the first term in our desired inequality. We now focus on the
second term. Since $|F^*(\cdot)| \le 1$, we have

E,;,,.,[„,j[F*(^^y*(^(^°))]" 



E, 



m — 1 



m 



< Ee„pO, E 



■eo e r2 Jo ^'/^ 1 > e * ( 1/ ( S) \eo ) 



-E, 



i>evi(s) 



(by the definition of F* since {F^l < 1) 



Looking at the second term first, this can be written as 

— Y[ IP'¥>G$(v(s)) [G{(p{v)) = S{v)] (since (p maps all v S 4'(Vi(S')) independently) 



t>eVi(s) 



= — Jl dG{S{v)) (by the definition of do (5'(u)) ) . 



«eVi(S) 

In a similar way, we can interpret the first term as computing the probability that 2 + 2{rh — 2) 
random (visible or invisible) vertices chosen independently will have vertex colors in G which match 
those of their corresponding vertices in S. This probability can be written as 

= n dc,{s(v)r n dG(^(^;». 



Putting all these observations together, the proof of the first part of Lemma 3.3 is complete.
Next, we show the last sentence of the lemma. The left hand side of dH) is at most 



E 



E 



cl>e'S>ih) 



eGV2(S) i>GVi(5) 

n ^-w^)) n m^v)) = s{v)i 

eeV2(5) i>GVi(S) 



/P^eHh) [Gi<l>{v)) = S{v)yv e Vi(^)] 



/( n dG(^(«» 
f E^e$(,„^)Ee.ej7,[fEeeo,[^^eo (e) lG{de) = S{deo)j\e e*]) |G(5e*) = S{deo)] 



2 



/ Jl dG(5'(w)) (-.-e ^ e* and G{de*) = S{deo) imply G{de) = S{deo)) 

\f6Vi(S) / 

< E^e$(„,„)Ee.ej^,[(^Eeeo,[Fe„(e)|e''~^e*]j |G(ae*) = 5(9eo)] 
1 + 1 ( n dG(5(.)) 



The assumption on m now completes the proof of ([5]). q 



3.2. The body of our proof. 

Definition 3.2. [Notation for the proof] Write
$c_i(\mathbf{G}) := \max_{I\in\binom{\mathfrak{r}}{i}} |C_I(\mathbf{G})|$ for $i = 1, 2$. For
$b = (b_1, b_2)$ and an integer $m$, we write $B(b, m) := (B_i(b, m))_{i\in[2]}$ where
$B_1(b, m) := b_1 \cdot b_2^{(r-1)m}$ and $B_2(b, m) := b_2$.

Recalling the definition of the regularization $\mathbf{G}/\varphi$, it is easy to see that if
$\mathbf{G}$ is a $b$-colored graph then

$c_i(\mathbf{G}/\varphi) \le B_i(b, m), \quad \forall i = 1, 2, \ \forall \varphi \in \Phi(m).$ (11)

(If $i = 2$, it is obvious since regularization does not recolor any size-2 edge. If $i = 1$, the new
color of a vertex is determined by its original color and by the colors of the edges connecting the
vertex and the $(r-1)m$ random vertices.) □
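
The count behind (11) is elementary: a regularized vertex color consists of one original vertex color (at most $b_1$ choices) and $(r-1)m$ edge colors (at most $b_2$ choices each). A one-line Python sketch of the resulting bounds, with the hypothetical name `color_bounds`, reads as follows.

```python
def color_bounds(b1, b2, r, m):
    """Return (B_1, B_2) = (b1 * b2**((r - 1) * m), b2), assuming this reading of (11)."""
    return b1 * b2 ** ((r - 1) * m), b2


if __name__ == "__main__":
    # Two vertex colours, two edge colours, bipartite case, three samples per part.
    print(color_bounds(2, 2, 2, 3))
```

The exponential dependence on $m$ is why the later choice of $m(n+1)$ has to grow quickly with $m(n)$.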

Suppose we are given some fixed $h \ge 1$, $\varepsilon > 0$ and $b$. Our job will be to define suitable
functions $m$ and $\delta$, and a suitable integer $\tilde n$, so that (3) and (4) are satisfied. This
we now do.

• [Definition of the sample-size functions] Set 6^(0) ■= 'Ti(O) :— 0. Define fi^j^^^ = n to 
be large enough so that 



C V2( i ^V'^'"' and :^ { ^\ . (13) 



where 



(These expressions will appear in ([25|) and ((30|) .) 

We will define the function m recursively as follows. Suppose that m(n) has been defined for some 
value of n > 0. Let 

rh 



I (r— l)m(n) " 

M I 1 . (14) 



/ei 

(We will use the form (fH|) only once in Define m(rt + 1) so that 

A ^(r-l)m(«)\ 

m(?i + l) > m(n) + Mh = m{n) + { ^ ^ ^ h. (15) 

\ ^/eT / 



Next, we define the error function 5. 

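• [Definition of the error function] For $\varphi \in \Phi(m(n))$, we write
$\mathbf{G}^* := \mathbf{G}/\varphi$ and we define the error function
$\delta = \delta_{h,\varepsilon,\mathbf{G}^*}$ inductively as follows.
First, define

$\delta(c) := 0$ and $\eta(c) := 0$ for all $c \in \mathrm{TC}_I(\mathbf{G}^*)$ with $I \in \binom{\mathfrak{r}}{1} = \mathfrak{r}$. (16)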
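Before defining $\delta(c)$ and $\eta(c)$ for $c \in \mathrm{TC}_2(\mathbf{G}^*)$, we define 'bad
colors' $\mathrm{BAD} \subset \mathrm{TC}(\mathbf{G}^*)$. For $I \in \binom{\mathfrak{r}}{[2]}$,
we define $\mathrm{BAD}_I$ by the relation that $c = (c_J)_{J\subset I} \in \mathrm{BAD}_I$ if and only if

$d_{\mathbf{G}^*}((c_J)_{J\subset I^*}) \le \sqrt{\varepsilon_1} / |C_{I^*}(\mathbf{G}^*)|$ for some $I^*$ with $\emptyset \ne I^* \subset I$. (17)
Define $\mathrm{BAD} := \bigcup_{I\in\binom{\mathfrak{r}}{[2]}} \mathrm{BAD}_I$. A bad edge will mean a
visible edge whose color is bad.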
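For $c = (c_J)_{J\subset I} \in \mathrm{TC}_2(\mathbf{G}^*)$, we define, using the constants $M$ and
$C$ fixed above,

$\eta(c) := \mathbb{E}_{\varphi'\in\Phi(Mh)} \mathbb{E}_{\mathbf{e}^*\in\Omega_I}\Bigl[ \bigl( \mathbb{P}_{\mathbf{e}\in\Omega_I}[\mathbf{G}^*(\mathbf{e}) = c_I \mid \mathbf{e} \overset{\partial\mathbf{G}^*/\varphi'}{\approx} \mathbf{e}^*] - d_{\mathbf{G}^*}(c) \bigr)^2 \Bigm| \mathbf{G}^*(\partial\mathbf{e}^*) = (c_J)_{J\subsetneq I} \Bigr],$ (18)

$\delta(c) := 1$ if $c \in \mathrm{BAD}_I$, and $\delta(c) := C\sqrt{\eta(c)}$ otherwise. (19)
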



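First, we show that with the above specified choices for $m$, $\tilde n$ and $\delta$, (3) is satisfied.
• [The qualification as an error function] Clearly it is enough to show that

$\mathbb{P}_{\phi\in\Phi(h)}\bigl[\mathbf{G}^*(\phi(e)) = S(e), \forall e \in \mathbb{V}(S)\bigr] = \prod_{e\in\mathbb{V}_1(S)} d_{\mathbf{G}^*}(S\langle e\rangle) \prod_{e\in\mathbb{V}_2(S)} \bigl(d_{\mathbf{G}^*}(S\langle e\rangle) \pm \delta(S\langle e\rangle)\bigr)$ (20)

for any $S \in \mathcal{S}_{h,\mathbf{G}^*}$. Furthermore, without loss of generality, we can assume that

$S\langle e\rangle \notin \mathrm{BAD}$ for any $e \in \mathbb{V}(S)$. (21)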

(Indeed, we can show this by induction on the number of bad edges in $S$. Let a complex $S$ be
given where $S$ contains a bad edge $e^*$. First suppose that there exist no bad vertices, so that
$e^*$ contains two different vertices (which are not bad). By the induction hypothesis, (20) holds for
the complex $S^*$ obtained from $S$ by recoloring $e^*$ in the invisible color. Equality (20) means
that the real number on the left-hand side belongs to the interval described by the right-hand side.
Denote this interval by $[p_-, p_+]$. We then reconstruct $S$ from $S^*$ by recoloring the invisible
edge $e^*$ in the original bad color. In passing from $S^*$ back to $S$, the left-hand side of (20)
does not increase (it may decrease because of the added visible edge $e^*$), and the right-hand side
describes the interval $[0, p_+]$ because
$d_{\mathbf{G}^*}(S\langle e^*\rangle) \pm \delta(S\langle e^*\rangle) = [0, 1]$ by (19). Hence (20)
holds also for $S$. Second, suppose that $e^*$ consists of a single bad vertex $v^*$. Then we recolor
not only $v^*$ but also all edges containing $v^*$ in the invisible color. The same argument can be
applied.)

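Fix such an $S \in \mathcal{S}_{h,\mathbf{G}^*}$. For any $e \in \mathbb{V}_J(S)$ with
$J \subset \mathfrak{r}$, it follows from (16), (17) and (21) that

$d_{\mathbf{G}^*}(S\langle e\rangle) > \sqrt{\varepsilon_1} / |C_J(\mathbf{G}^*)|$ (if $|J| \le 2$) and $\delta(S\langle e\rangle) = 0$ (if $|J| = 1$). (22)
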



Using (|lip . and (P^ . a straightforward computation gives 



1 iII},{Tl 

— < 
M 



ci(G*) 



\ |Vi(S)| 122} 

) < n dG.(5(z;))< n dG.(5(^)) 



(23) 



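for any $e_0 \in \mathbb{V}_2(S)$. For any choice of $\emptyset \ne D \subset \mathbb{V}_2(S)$, we define
$S' \in \mathcal{S}_{h,\mathbf{G}^*}$ so that $\mathbb{V}_2(S') = D$ and $S'(e) = S(e)$,
$\forall e \in D$, and that $S'(v) = S(v)$, $\forall v \in \mathbb{V}_1(S) = \mathbb{V}_1(S')$. Now,
applying Lemma 3.3 for $S'$ with
$F_e(\mathbf{e}) := [\![\mathbf{G}^*(\mathbf{e}) = S(e)]\!] - d_{\mathbf{G}^*}(S\langle e\rangle)$, we have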

-1^2 

E^e^W n (IG*(<^(e) = 5'(e)]-dG.(5'(e))) 

eeV2(S') 



G*{<j){v)) = 5'(v),Vt. e Vi(S") 



E 



n(IG*(0(e) = 5(e)l-dG-(5(e))) 



eS-D 



G*(0(i;)) = 5(«),Vz;e Vi(5) 



E 



0e$(/i) 



n ^^w^)) 



Le6_D 



G*(</)(i>)) = 5(«),V«e Vi(5) 



< 2 min Eyg$(M/i)IEe* efii [ Eeefj^ [Fg^ (e) | e 

eoG-D ' 



\G*{de*) = S{deo) 



2 min E^e$(M,oEe.eo,[( Eeeo,[IG*(e) = 5(eo)l| e ^V^ e*] - dG.(5(eo))] ) | G*(9e*) = 5(9eo) 
(by the definition of $F_e(\mathbf{e})$, since $d_{\mathbf{G}^*}(S\langle e\rangle)$ does not depend on $\mathbf{e}$)



2 • min ?7(5(eo)) 

eoG-D 



< 2 • max r/{S(eQ)). 

eoGV2(S) 

Now, choose an $e_0 \in \mathbb{V}_2(S)$ which maximizes $\eta(S\langle e_0\rangle)$. It then follows from Lemma 3.1 that
P0e*(ft)[G*(0(e)) = S{e), Ve e V2(5)|G*(0(z;)) - V« £ Vi(5)] 

n dG*(5(e))±V2|V2(^)|v/?7(5(eo)) 

eGVsCS) 



(24) 



dp 

m 



d.-(5(e.),±. V^0"^VW<5^ 



t) n do-lSW) 

/ eGV2(S),e^^eo 



(2Vir/c2(G*))l^^(^)|- 

dG*(5'(e)) (■.■ C2(G*) = C2(G) < 62) (25) 



eGV2(S),e7^eo 



eeV2(S) 



(26) 



For any $S \in \mathcal{S}_{h,\mathbf{G}^*}$, (20) holds, and we have shown that $\delta$ satisfies (3).
Finally we turn to showing that $\delta$ satisfies (4).

• [Bounding the average error size] For $I \in \binom{\mathfrak{r}}{2}$, it follows from the linearity of expectation that

^ 2 

^ne[o,ft-i],ipe*(m(n))IEeenj iV 7?(G*(e))] ^ 



< En,^EeenAv{G* (e))] (by Cauchy-Schwarz) 



E„,^,eVe*(M/.)lEe*ei2.[ PeeS2,[G*(e) = G*(e)|e « e*] - dc- (G*(e)) | e* « g; 



dG' 



dG' 



IG*(e) = c/] E [ P [G*(e) = c,|e""«^ e*] - P [G*(e) = c,| e e]) ) | e* "5 e] 



ciECiiG 



< E„,^,e V.e4(]PeeJ^,[G*(e) = c/|e^''«^ e*]-Peej^,[G*(e) = c/|e''~ e])j 16*"^ e] 

C7ec/(G') \ / 



9G* 



V E - 



aG*/v 



aG* 



E^.^,4|Peej^,[G*(e) = c,|e « e*] ) | e* « e] + ( Pe^o, [G*(e) - c,| e « e]) 



aG* 



aGVv 



aG* 



2E^, [ ( Peej^, [G* (e) - c, I e « e*] ) | e* « e] ( Peeo. [G* (e) = c, | e « e] ) 



aG* 



V E - 



|C/(G)|E,,E„ 



(*) 

< ^2Eo<,i<fiEe,cj 



dG'/ip 



dG' 



dG' 



E,.3.[ Peeo.[G(e) = c,|e « e*] |e* « e] - PeeJ^, [G(e) = c/| e « e]) 



(since r/"^ is a refinement of w ) 



E 



a(G/i^)/.p _ 



ip'e*(A//i) 



[ Pe[G(e) = c/|e e] ] - Pe[G(e) = c/|e ^« e]) 



diG/v) 



E 



aG/0 



0"e$(m(„+i))[ Pe[G(e) = c,|e « e] ] - E^e$(„(„))[ Pe[G(e) - c,| e « e] ] 



aG/0 



, h— 1 

n ^-^ 

n=0 



dG/tf 



IE0G*(m(«+i))[ Pe[G(e) = c/|e w e] ] -E0g<[,(„(„))[ Pe[G(e) = c/|e « e] ] 



aG/0 



^E- 
n 



aG/0 



IE0e*(m(fi))[ Pe[G(e)-c/|e « e] ] - E^e$(„(o)) [ Pe[G(e) = c/| e « e] ] 
$\le \dfrac{b_2}{\tilde n}$ (since the sum telescopes!)



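where in (*) above we use Fact 3.2 and the property that, after $n$ is chosen, it follows from (15)
that $m(n+1) \ge Mh + m(n)$ and that if $\phi''(\mathfrak{D}) \supset \varphi(\mathfrak{D}) \cup \varphi'(\mathfrak{D})$
(where $\phi''(\mathfrak{D})$, $\varphi(\mathfrak{D})$, $\varphi'(\mathfrak{D})$ denote the ranges
of those functions) then $\mathbf{e} \overset{\partial\mathbf{G}/\phi''}{\approx} \mathbf{e}'$ implies
$\mathbf{e} \overset{\partial(\mathbf{G}/\varphi)/\varphi'}{\approx} \mathbf{e}'$.
Now, for any $I \in \binom{\mathfrak{r}}{2}$, it follows from (19) that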

m , 

[^(G*(e))] < E„,^Eeej^,[CV?7(G*(e)) +Peej^, [G*(e) G BAD/] • 1] 
CE„,^Ee[v/?7(G*(e))] + E„,^ [Pegs^, [G*(e) e BAD/]] 



nzmzi z^; 



^Peei.,,[dG*(G*(e)) < ;28) 



Ljc/ 

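However, it is easy to see that for any $\tau > 0$, we have by the definition of $d_{\mathbf{G}^*}$

$\mathbb{P}_{\mathbf{e}}\bigl[d_{\mathbf{G}^*}(\mathbf{G}^*\langle\mathbf{e}\rangle) < \tau\bigr] = \mathbb{P}_{\mathbf{e}}\bigl[\mathbb{P}_{\mathbf{e}'}[\mathbf{G}^*(\mathbf{e}) = \mathbf{G}^*(\mathbf{e}') \mid \mathbf{G}^*(\partial\mathbf{e}) = \mathbf{G}^*(\partial\mathbf{e}')] < \tau\bigr] \le \tau.$ (29)
Hence, using (28) and (29), we can write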



E„,^Eees^,[(5(G*(e))] < Cy^+E„,^ 



E 



,,,,,C.(G*)|- 



V2y 



e 



To show that the expectation of the regularity is small, we compute 

E„,^[reg(G/^)] < E„.^[ max |CKG/<y5)| Eeeo,[<5(G*(e) 

^e([2i) 

< E,,^[^ |C/(G/</p)|Eeeo,[<5(G*(e)) 



as required. This completes the proof of Theorem 2.2. □